Conference Proceedings
Can deep effectiveness metrics be evaluated using shallow judgment pools?
X Lu, A Moffat, J Shane Culpepper
Association for Computing Machinery (ACM) | Published : 2017
Abstract
© 2017 Copyright held by the owner/author(s). Increasing test collection sizes and limited judgment budgets create measurement challenges for IR batch evaluations, challenges that are greater when using deep effectiveness metrics than when using shallow metrics, because of the increased likelihood that unjudged documents will be encountered. Here we study the problem of metric score adjustment, with the goal of accurately estimating system performance when using deep metrics and limited judgment sets, assuming that dynamic score adjustment is required per topic due to the variability in the number of relevant documents. We seek to induce system orderings that are as close as is possible to t..
Grants
Awarded by Australian Research Council
Funding Acknowledgements
This work was supported by the Australian Research Council's Discovery Projects Scheme (DP140103256 and DP170102231).